Blind speech separation of moving speakers using hybrid neural networks

Authors

Athanasios Koutras

Evangelos Dermatas

George K. Kokkinakis

Abstract

In this paper we present a novel method for Blind Speech Separation of convolutive speech mixtures of moving speakers in highly reverberant rooms. The separation network is a hybrid neural network that separates convolutive speech mixtures in the time domain, without any prior knowledge of the propagation media, based on the Maximum Likelihood Estimation (MLE) principle. The proposed method significantly improves the performance of a phoneme-based continuous speech recognition system (by more than 13% in all adverse mixing situations) and can therefore be used as a front-end to separate the simultaneous speech of speakers who are moving in reverberant rooms.
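For readers unfamiliar with the term, a convolutive mixture means that each microphone records a sum of the source signals, each filtered by a room impulse response. The sketch below only illustrates that mixing model and is not the separation method of the paper. The function names, the filter taps in `H`, and the random test signals are hypothetical, and the filters are kept fixed, whereas moving speakers would make them vary over time.

```python
import random

def fir_filter(signal, taps):
    """Convolve a signal with FIR taps, truncated to the signal length."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out[n] = acc
    return out

def convolutive_mix(sources, filters):
    """Return mixtures x_i[n] = sum_j (h_ij * s_j)[n].

    filters[i][j] holds the taps from source j to microphone i.
    """
    n_samples = len(sources[0])
    mixtures = []
    for row in filters:
        x = [0.0] * n_samples
        for s, taps in zip(sources, row):
            y = fir_filter(s, taps)
            x = [a + b for a, b in zip(x, y)]
        mixtures.append(x)
    return mixtures

# Two short stand-in "speech" signals (random noise, for illustration only).
rng = random.Random(0)
s1 = [rng.uniform(-1, 1) for _ in range(8)]
s2 = [rng.uniform(-1, 1) for _ in range(8)]

# Hypothetical room responses: a direct path on the diagonal,
# delayed and attenuated cross-talk off the diagonal.
H = [[[1.0], [0.0, 0.5, 0.2]],
     [[0.0, 0.4, 0.1], [1.0]]]

x1, x2 = convolutive_mix([s1, s2], H)
```

A blind separation method sees only `x1` and `x2` and has to estimate the sources without knowing `H`.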

Similar resources

Blind speech separation of moving speakers in real reverberant environments

In this paper we present a new on-line Blind Signal Separation method capable of separating convolutive speech mixtures of moving speakers in highly reverberant rooms. The separation network is a recurrent network that separates convolutive speech mixtures in the time domain, without any prior knowledge of the propagation media, based on the Maximum Likelihood Estimation (MLE) p...

Adaptive Speech Separation Using Hybrid Approach

This paper presents a hybrid iterative learning algorithm, based on higher-order statistics, for training recurrent neural networks to perform blind signal separation. Fourth-order statistics are used as the separation criterion to train an RNN to perform the separation. Some simulation results for both artificially convolved audio signals and real recordings demonstrate that the proposed approach is prom...

Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models

In this paper, the gain in the LD-CELP speech coding algorithm is predicted using three neural models, which are equipped with genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of the neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variability in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
